Four rats’ choices between two levers were differentially reinforced using a runs-test algorithm. On each 
trial, a runs-test score was calculated based on the last 20 choices. In Experiment 1, the onset of stimulus 
lights cued when the runs score was smaller than criterion. Following cuing, the correct choice was 
occasionally reinforced with food, and the incorrect choice resulted in a blackout. Results indicated that 
this contingency reduced sequential dependencies among successive choice responses. With one 
exception, subjects’ choice rule was well described as biased coin flipping. In Experiment 2, cuing was 
removed and the reinforcement criterion was changed to a percentile score based on the last 20 
reinforced responses. The results replicated those of Experiment 1 in successfully eliminating first-order 
dependencies in all subjects. For 2 subjects, choice allocation was approximately consistent with 
nonbiased coin flipping. These results suggest that sequential dependencies may be a function of 
reinforcement contingency. 
The variability of a series of responses, 
distributed between some alternatives such as 
left (L) and right (R) levers, has been defined 
in terms of two properties from the concept of 
randomness (Neuringer, 2002). First, variabil- 
ity is high if each member of a set is as 
frequent (overall) as any other member of the 
set, that is, the relative frequencies (or 
probabilities) of different response alterna- 
tives are similar, as in a uniform probability 
distribution. Second, variability is high if the 
relative frequencies of all higher-order sequen- 
tial combinations, such as dyads, triads, etc. are 
also (over the long run) equal. The former 
implies a property of equiprobability, and the 
latter implies that of sequential independence. 

Previous research aimed at producing highly 
variable performance has used reinforcement 
contingencies that are based on the relative 
frequencies of the response alternatives. In 
most studies, these contingencies have in- 
volved frequency-dependent selection. For 
example, Page and Neuringer (1985) reinforced 
responses when they had not occurred 
in the last N trials, whereas Machado (1992) 
reduced reinforcer likelihood when the fre- 
quency of a response increased. These and 
other studies (Blough, 1966; Bryant & Church, 
1974; Denney & Neuringer, 1998; Machado, 
1989; Pryor, Haag, & O’Reilly, 1969; Schoenfeld, 
Harris, & Farmer, 1966; Shimp, 1967) all 
reinforced response alternatives that had a 
low (or zero) frequency in the recent past. 

In many experiments, a single trial consisted 
of the emission of a response unit, defined by 
the reinforcement contingency, comprising a 
four-response sequence of binary choices, such 
as left (L) and right (R) responses. When 
observed probabilities of the 16 ($2^4$) possible 
response combinations (e.g., RLRR) were 
equal, the behavior was deemed to have 
maximum variability. By definition, any bias 
in the frequency distribution of the alterna- 
tives indicates reduced variability, and exclu- 
sive emission of any particular sequence 
constitutes minimal variability. Thus, such 
studies were concerned chiefly with the rela- 
tive frequencies of response alternatives. They 
attempted to control response bias by rein- 
forcing response distributions that exhibit 
maximum dispersion (Abreu-Rodrigues, Lat- 
tal, dos Santos, & Matos, 2005; Cherot, Jones, 
& Neuringer, 1996; Cohen, Neuringer, & 
Rhodes, 1990; Denney & Neuringer, 1998; 
Doughty & Lattal, 2001; Machado, 1989; 
McElroy & Neuringer, 1990; Miller & Neuringer, 
2000; Mook, Jeffrey, & Neuringer, 1993; 
Morgan & Neuringer, 1990; Morris, 1987, 
1989, 1990; Neuringer, 1991, 1992, 1993; 
Neuringer, Deiss, & Imig, 2000; Neuringer & 
Huntley, 1991; Odum, Ward, Barnes, & Burke, 
2006; van Hest, van Haaren, & van de Poll, 
1989). 

Frequency-dependent reinforcement can be 
used to create sequential independence as well 
as equiprobability, although it may require a 
set of more than eight response alternatives. 
Machado (1992, 1993) systematically investi- 
gated the necessary and sufficient conditions 
of random-like performance. Using a set of 
two response alternatives (L, R) as targets of a 
frequency-dependent selection, he found that pigeons 
had a significant tendency to alternate 
responses (LRLRLR...). Next, using sequences 
involving two successive responses as targets 
(LL, LR, RL, RR), some, but not all, pigeons 
performed double alternation patterns suc- 
cessfully; however, when he used all possible 
combinations of three-response sequences to 
define target sets (i.e., LLL, LLR, LRL, LRR, 
RLL, RLR, RRL, RRR), then all pigeons 
performed randomly. The results suggest that 
the last procedure suffices to engender ran- 
dom-like behavior in that all of the possible 
response sequences have the same strength. If 
all are equiprobable, then sequential depen- 
dencies cannot be present. 

It is, however, important to underscore that 
sequential independence can be achieved 
even when individual response alternatives 
are not equally probable (Nickerson, 2002). 
To illustrate our rationale, consider a case 
involving two mutually exclusive events, such 
as heads (H) or tails (T) in a coin toss. An 
alternation pattern of HTHTHT... shows that 
the H and T are equiprobable, thereby 
meeting one standard of randomness; howev- 
er, it fails a second standard of unpredictability 
because event order is perfectly predictable 
based on first order conditional probability. 
Conversely, events H and T can be sequentially 
independent even when they are not 
equiprobable [e.g., p(T) > p(H), as when a 
coin is biased], provided that their conditional 
probabilities equal their unconditional 
probabilities [i.e., p(H|T) = p(H) and 
p(T|H) = p(T)]. In a relevant 
experiment, Machado (1994) used frequency-
dependent selection to shape molar response 
proportions toward various equilibrium values 
between 0 and 1, and examined sequential 
dependencies in local response sequences. 
The procedure successfully altered molar 
response proportions, and at extreme values, 
local performance fell midway between biased 
randomness (sequential independence) and 
stable sequences (which imply successive de- 
pendence). That is, when molar response 
proportions deviated from .5, stable local 
patterns that were present at .5 broke down, 
although not to the extent that they con- 
formed to biased coin flipping. 
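
The coin-toss illustration above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `first_order_probs`, the simulated sequences, and the bias of .3 are all assumptions chosen for illustration. It estimates p(H) and p(H|T) for a strictly alternating sequence and for a simulated biased but independent coin.

```python
import random
from collections import Counter

def first_order_probs(seq):
    """Estimate p(H) and p(H | previous toss was T) from a string of 'H'/'T'.

    Assumes the sequence contains at least one T that is followed by a toss.
    """
    p_h = seq.count("H") / len(seq)
    pairs = Counter(zip(seq, seq[1:]))            # counts of (previous, current)
    after_t = pairs[("T", "H")] + pairs[("T", "T")]
    return p_h, pairs[("T", "H")] / after_t

# Strict alternation: equiprobable, yet perfectly predictable.
alternating = "HT" * 50
alt_p_h, alt_p_h_given_t = first_order_probs(alternating)

# Hypothetical biased coin with independent tosses (p(H) = .3 is an arbitrary choice).
rng = random.Random(1)
biased = "".join("H" if rng.random() < 0.3 else "T" for _ in range(10000))
bias_p_h, bias_p_h_given_t = first_order_probs(biased)

print(alt_p_h, alt_p_h_given_t)      # p(H) = .5 but p(H|T) = 1.0
print(bias_p_h, bias_p_h_given_t)    # the two estimates should be close to each other
```

In the alternating case the two estimates differ maximally, whereas for the biased coin they should agree up to sampling error, which is the sense in which bias and sequential dependency are separable.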

A more direct approach to controlling 
sequential dependencies might be more suc- 
cessful in achieving sequential independence, 
and hence, random-like behavior. One aim of 
our research is to present an approach based 
on the idea that run-length frequencies can 
serve as a basis for contingent reinforcement. 
Using such a contingency, we can ask whether 
reinforcement of certain run-length distribu- 
tions, expected from a putatively “random” 
source, leads to random-like behavior. To 
ensure that a reinforcement contingency 
targets sequential dependency per se, the 
procedure must have an impact on the 
sequential dependency of interest but leave 
the relative frequencies of responses unaffect- 
ed. That is, the ideal procedure must separate 
the influence on sequential dependency from 
any influence on relative frequencies of 
responses. The procedure we developed here 
is derived from the runs-test algorithm for 
randomness from Siegel (1956). A run is 
defined as an uninterrupted sequence of 
identical elements delimited by different 
elements. The number of runs in a sequence 
equals the number of response alternations 
plus one. Generally, when the observed num- 
ber of runs is significantly different from the 
expected number of runs, calculated accord- 
ing to overall response proportion, the runs 
test rejects the null hypothesis that the 
sequence was independent. Plainly, when 
alternation occurs either too infrequently or 
too frequently in the sequence, this sequence 
is regarded as including a certain regular 
pattern, and the null hypothesis will likely be 
rejected. 

Our procedure reinforced, on each trial, an L 
or R response whose score was smaller in absolute value than 
the critical value of the runs test. With 
$K$ denoting the observed number of 
runs, the expected number of runs ($\bar{K}$) and its 
variance ($\sigma^2$) were computed according to 
the following equations: 

$$\bar{K} = \frac{2 n_R n_L}{n_R + n_L} + 1 \tag{1}$$

$$\sigma^2 = \frac{2 n_R n_L \,(2 n_R n_L - n_R - n_L)}{(n_R + n_L)^2 \,(n_R + n_L - 1)} \tag{2}$$

in which $n_R$ and $n_L$ represent the number of R 
and L responses, respectively, in a sample 
sequence. Then, the runs-test score, $S$, was 
calculated as follows: 

$$S = \frac{K - \bar{K}}{\sigma} \tag{3}$$


In Experiment 1, we introduced the new 
reinforcement contingency in a modest way, 
that is, stimulus lights above levers were used 
as a conditioned reinforcer, because a previous 
study demonstrated that the effect of a 
contingency on behavioral variability was 
stronger under conditioned reinforcement 
(Cherot et al., 1996), and was maintained in 
a delayed-reinforcement situation (Odum et 
al., 2006; Wagner & Neuringer, 2006). Accordingly, 
stimulus lights were illuminated in 
Experiment 1 when a subject’s performance 
fell within the criterion range, and a primary 
reinforcer was provided with p = .1 in that 
state. Next, in Experiment 2, we removed the 
conditioned reinforcers and examined the 
effect of direct reinforcement with a more 
sophisticated experimental design. 


When $n_R$ and $n_L$ are large, the sampling distribution 
of K approaches the normal distribution and 
S (Equation 3) is approximately a unit normal variable 
(hence the familiar critical value of ±1.96 for alpha 
= .05). We discuss the relation between the 
distribution and our procedure further in the 
General Discussion. 

Our procedure used an algorithm that 
calculated S (from the last 20 responses) every 
time a response was emitted, and compared it 
with a critical value to determine whether 
reinforcement would be delivered. With a 
fixed sample size of 20, we needed only two 
parameters for calculation: the proportion of 
emitted responses [p(L) = 1 − p(R)], and 
the number of runs. We initially set two 
critical boundary values for S, ±1.96. Over 20 
responses, comprising Rs and Ls, observed S 
values that fell within these boundaries were 
eligible for reinforcement. Note that within 
wide limits, the use of a runs-test score does 
not require any given proportion of L and R 
responses for reinforcement. For example, 
suppose $n_R$ and $n_L$ were 4 and 16, respectively 
[p(L) = .8], and K was 4. In this case, 
the score would be −2.52, the null hypothesis 
would be rejected, and reinforcement 
would not be given for the last response. 
With the same frequencies for L and R but 
with K = 6, however, the score is −1.04 and 
is eligible for reinforcement. As this case 
illustrates, subjects could satisfy the contin- 
gency even if the response proportion was 
quite strongly biased. 
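
The worked example above can be reproduced with a few lines of code. This is a minimal sketch rather than the software used in the experiments; the function names are hypothetical, and it assumes the standard runs-test expectation, 2 n_R n_L / (n_R + n_L) + 1, which is the form that reproduces the scores quoted in the text.

```python
import math

def count_runs(seq):
    """Number of runs = number of response alternations + 1."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))

def runs_score(n_r, n_l, k):
    """Runs-test score S (Equation 3) from response counts and observed runs k.

    Undefined (division by zero) when only one alternative occurs.
    """
    n = n_r + n_l
    expected = 2 * n_r * n_l / n + 1                                   # Equation 1
    variance = 2 * n_r * n_l * (2 * n_r * n_l - n) / (n ** 2 * (n - 1))  # Equation 2
    return (k - expected) / math.sqrt(variance)

def score_sequence(seq):
    """Convenience wrapper for a string of 'L'/'R' choices."""
    return runs_score(seq.count("R"), seq.count("L"), count_runs(seq))

print(round(runs_score(4, 16, 4), 2))   # -2.52, outside the ±1.96 boundaries
print(round(runs_score(4, 16, 6), 2))   # -1.04, eligible for reinforcement
```

Both calls use the same response proportion, p(L) = .8, so the difference in eligibility comes entirely from the number of runs.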


EXPERIMENT 1 

In Experiment 1, we examined the effect of 
the runs-test contingency with a conditioned 
reinforcer. We reinforced responses that pro- 
duced S scores within a required range, but 
with low probability (.1). To help establish 
responding that met criteria for sequential 
independence, we used stimulus lights as a 
conditioned reinforcer. Two stimulus lights, 
one above each of two levers, were illuminated 
when the score of the runs test was within a 
criterion range, whereas they were extin- 
guished when the score was outside this range. 
Thus, if a response occurred that met the runs 
criterion, and the stimulus lights were off, then 
stimulus lights were turned on. If the lights 
were already on, then they remained on for as 
long as successive responses continued to meet 
the criterion. If the lights were on and the 
response did not meet criterion, then they 
were turned off. If the lights were already off 
and the response did not meet the criterion, 
they remained off. 

Reinforcement occurred only for those re- 
sponses that met the stipulated runs criteria. 
Thus, responses that initiated or maintained 
illumination (i.e., lights on) sometimes received 
primary reinforcement. Although the aim was 
to extinguish responses that did not meet the 
runs criterion, it was necessary to reinforce 
some of these responses early in the experi- 
ment in order to prevent complete extinction 
in subjects that exhibited low behavioral variability. 
Accordingly, responses that maintained 
the lights in the off state did receive some 
reinforcement at the beginning of this exper- 
iment, but the frequency of this reinforcement 
was lower than for criterial responses. 

Method 

Subjects 

Four male Wistar rats were maintained at 
approximately 80% of their free-feeding body 
weights. Water and sawdust were continuously 
available in their home cages where a 12-hr 
light-dark cycle was in effect. At the beginning 
of the experiment, two 46-week-old subjects 
(Rat 1 and Rat 3) had previous experience with 
variability reinforcement schedules; one 48- 
week-old subject (Rat 5) only had experience 
with lever-press training; and the 4th subject 
(Rat 9), which was 32 weeks old, had experience 
under a concurrent-chains schedule. 

Apparatus 

The experimental chamber was 210 mm 
long by 280 mm wide by 270 mm high, and was 
enclosed in a sound-dampening box. The 
chamber had a ceiling and side walls con- 
structed of Plexiglas and front and back walls 
of metal. The front wall contained two 
shielded stimulus lights (white 28-V bulbs), 
120 mm above the floor and 100 mm apart. 
Two response levers, requiring a force of 0.15 
N to operate, were located 70 mm above the 
floor and 80 mm apart measured center to 
center. A pellet tray that received 45-mg food 
pellets was centered between the levers 20 mm 
above the floor. A shielded houselight (28-V 
bulb) was on top of the back wall. A speaker 
for presenting white noise and a ventilating 
fan were attached on the outer box. All 
experimental devices were controlled and 
monitored by a MED-PC version 2.0 system. 

Procedure 

Because all rats had previous experimental 
experience, they were placed immediately in 
the runs-test procedure. A session consisted of 
440 trials per day, and a trial consisted of a 
single response, L or R. Responses could occur 
freely except that each one turned off the 
houselight for 0.2 s, during which further 
responses had no effect. 

After the first 20 responses of the session, 
each response yielded an S score. If the 
absolute value of the runs-test score fell within 


stipulated boundaries, shown as the unshaded 
cells in Figure 1, then stimulus lights were 
turned on and a food pellet was delivered with 
p = .1. At the beginning of the experiment, 
none of the animals met the criterion. For 
responses that maintained a lights-off state, 
responses were reinforced also with p = .1 if 
the current score was closer to zero than the 
two previous scores (for responses that turned 
off the light, this condition could not be met). 
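
The trial-by-trial logic described above can be sketched as a short simulation. This is an illustrative reconstruction, not the MED-PC program used in the experiment. The function names, the seeded random generator, and the treatment of windows containing only one lever (S is undefined there, so such trials are treated as failing the criterion) are assumptions.

```python
import math
import random

def runs_score(window):
    """Runs-test score S for a string of 'L'/'R' choices (None if one lever only)."""
    n_r, n_l = window.count("R"), window.count("L")
    if n_r == 0 or n_l == 0:
        return None                      # assumption: undefined S never meets criterion
    n = n_r + n_l
    k = 1 + sum(a != b for a, b in zip(window, window[1:]))
    expected = 2 * n_r * n_l / n + 1
    variance = 2 * n_r * n_l * (2 * n_r * n_l - n) / (n ** 2 * (n - 1))
    return (k - expected) / math.sqrt(variance)

def simulate_session(choices, critical=1.96, p_food=0.1, window=20, seed=0):
    """Return (lights_on, food) for every response after the first `window`."""
    rng = random.Random(seed)
    history = []                         # |S| on earlier scored trials
    outcomes = []
    for t in range(window + 1, len(choices) + 1):
        s = runs_score(choices[t - window:t])
        abs_s = math.inf if s is None else abs(s)
        lights_on = abs_s < critical
        if lights_on:
            eligible = True
        else:
            # Early-training rule: score closer to zero than the two previous scores.
            # A response that turns the lights off can never satisfy this.
            eligible = len(history) >= 2 and abs_s < min(history[-2:])
        food = eligible and rng.random() < p_food
        history.append(abs_s)
        outcomes.append((lights_on, food))
    return outcomes

strict_alternation = simulate_session("LR" * 220)
double_alternation = simulate_session("LLRR" * 110)
```

With 440 responses per session, each call returns 420 scored trials. Strict alternation produces too many runs and never turns the lights on, whereas the LLRR pattern keeps the 20-response score near zero on every trial, which shows that the score alone cannot detect higher-order regularities.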

The criterion for receipt of a food pellet 
became stricter as training progressed. In the 
first experimental condition the critical value 
on the runs test was set to ±1.96, and 
training continued until performance became 
stable. After performance attained stability, 
food delivery on light-off trials was 
terminated. Then, in the second condition 
the critical value was changed from ±1.96 to 
±1.39, and training continued until 
performance became stable. 

Sessions continued until the relative fre- 
quencies of R responses and the number of 
alternations were judged to be stable under 
the following criterion: the last nine sessions 
were divided into three blocks and the largest 
difference between the medians of the three 
blocks was within 15% of the average of the last 
nine sessions. 
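
The stability criterion can be written as a small check. This is a sketch under stated assumptions: the three blocks are taken to be consecutive groups of three sessions, "within 15%" is read as less than or equal to 15% of the nine-session mean, and the function name is hypothetical.

```python
from statistics import mean, median

def is_stable(session_values, tolerance=0.15):
    """Apply the three-block median criterion to the last nine session values."""
    if len(session_values) < 9:
        return False
    last9 = list(session_values[-9:])
    block_medians = [median(last9[i:i + 3]) for i in (0, 3, 6)]
    largest_difference = max(block_medians) - min(block_medians)
    return largest_difference <= tolerance * mean(last9)
```

In the experiment the check would be applied separately to the relative frequency of R responses and to the number of alternations, with both required to pass.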

Data Analysis 

A Markov chain model is appropriate for 
analyzing sequential dependencies in behavioral 
variability (see Machado, 1997). With our 
contingency, we expect to observe an increased 
frequency of intermediate numbers 
of runs according to the proportion of L and 
R, that is, no first-order dependency. The S 
value of the runs test is of limited value here 
because it does not show whether there are 
higher-order dependencies. Accordingly, an 
additional analysis is needed to examine 
sequential dependencies in greater detail. 

Fig. 1. All possible scores on the runs test for a sample of 20 responses, calculated from Equation 3. Response 
proportion is $n_x / (n_x + n_y)$, in which $n_x$ is the number of responses on the less-chosen alternative. White cells signify data within the ±1.96 
criterial range, while grey cells fall outside this range. 

There are several methods of assessing 
sequential dependencies, including chi-square 
goodness-of-fit tests, likelihood ratio tests, and 
an approach based on information theory. 
Although these indices are related to each 
other, and there is little to choose among 
them for statistical analysis, the estimated 
values of mutual uncertainties provide a valuable 
visual aid to complement the significance tests, 
which depend on the validity of the chi-square 
approximation (Attneave, 1959; Chatfield, 
1973; Chatfield & Lemon, 1970; Miller & 
Frick, 1949; Pincus & Singer, 1996). Using 
these values, we can track the changes in 
performance as training progressed. We use 
the mutual uncertainties (Ts) from information 
theory as follows: 

$T_1 = H_1 - H_2$, for first-order dependency, 
$T_2 = H_2 - H_3$, for second-order dependency, 
and 
$T_3 = H_3 - H_4$, for third-order dependency, 

where 

$$H_1 = -\sum p_i \log_2 p_i;$$

$$H_2 = -\sum p(i, j) \log_2 p(i, j) + \sum p_i \log_2 p_i;$$

$$H_3 = -\sum p(i, j, k) \log_2 p(i, j, k) + \sum p(i, j) \log_2 p(i, j); \text{ and}$$

$$H_4 = -\sum p(i, j, k, l) \log_2 p(i, j, k, l) + \sum p(i, j, k) \log_2 p(i, j, k),$$

where i, j, k, l are arbitrary successive responses 
in a session. We transform the Ts into chi-square 
statistics so that variation in the estimated 
values of mutual uncertainties can be observed 
and tested for significance at the same time. 
The chi-square form is as follows: 

$$\chi^2 = 2 (\log_e 2)\, N\, T_m, \qquad df = c^{m-1}(c - 1)^2,$$

where N is the number of trials per session, 
and c is the number of response alternatives, 
that is, two (left or right). The 
subscript m reflects the order of the dependency 
being tested. Using 
these indices, we observe the change of 
sequential dependencies. 
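
The uncertainty indices can be computed directly from a session's response string. The following is a minimal sketch with hypothetical function names. It assumes that probabilities are estimated from overlapping windows of successive responses and that N is the length of the analyzed sequence; the original analysis may have differed in these details.

```python
import math
from collections import Counter

def joint_entropy(seq, m):
    """Entropy in bits of overlapping m-response sequences (0 for m = 0)."""
    if m == 0:
        return 0.0
    grams = Counter(tuple(seq[i:i + m]) for i in range(len(seq) - m + 1))
    total = sum(grams.values())
    return -sum(c / total * math.log2(c / total) for c in grams.values())

def conditional_uncertainty(seq, m):
    """H_m: uncertainty of a response given the m - 1 responses preceding it."""
    return joint_entropy(seq, m) - joint_entropy(seq, m - 1)

def mutual_uncertainty(seq, m):
    """T_m = H_m - H_(m+1)."""
    return conditional_uncertainty(seq, m) - conditional_uncertainty(seq, m + 1)

def chi_square(seq, m, c=2):
    """Return (chi-square, df) for the m-th order dependency."""
    chi2 = 2 * math.log(2) * len(seq) * mutual_uncertainty(seq, m)
    df = c ** (m - 1) * (c - 1) ** 2
    return chi2, df

alternating = "LR" * 200
t1 = mutual_uncertainty(alternating, 1)   # close to 1 bit: fully first-order dependent
t2 = mutual_uncertainty(alternating, 2)   # close to 0: nothing left to explain
```

For strict alternation, knowing the previous response removes essentially all uncertainty, so T_1 is near its maximum of 1 bit and the higher-order terms are near zero.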

In addition to mutual uncertainties, we 
utilize a lag analysis to examine the obtained 
response patterns (Machado, 1992, 1993, 
1994). If $X_n$ is the response in trial n, then 
$p(X_{n+k} = R \mid X_n = R)$ is the probability of a right 
response in trial n + k, given a right response in 
the current trial n. The lag analysis plots 
$p(X_{n+k} = R \mid X_n = R)$ against k, the lag value. Strong 
deviation from the probability at lag 0 indicates 
sequential dependencies. For example, with 
perfect alternation (RLRLR. ..), lag 0 is the 
probability .5, lag 1 is 0, lag 2 is 1.0, lag 3 is 0, 
and lag 4 is 1.0. When there are no sequential 
patterns, all lags approximate the lag 0 value. 
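
A lag profile of this kind can be computed as follows. This is a minimal sketch; the function name and the `max_lag` default are assumptions, and lag 0 is taken to be the unconditional probability of the target response, as in the example above.

```python
def lag_profile(seq, target="R", max_lag=6):
    """p(X[n+k] == target | X[n] == target) for k = 0..max_lag.

    Lag 0 is the unconditional probability of `target`.
    """
    profile = [seq.count(target) / len(seq)]
    for k in range(1, max_lag + 1):
        starts = [n for n in range(len(seq) - k) if seq[n] == target]
        hits = sum(seq[n + k] == target for n in starts)
        profile.append(hits / len(starts) if starts else float("nan"))
    return profile

print(lag_profile("RL" * 100, max_lag=4))   # [0.5, 0.0, 1.0, 0.0, 1.0]
```

The printed profile matches the perfect-alternation example in the text: .5 at lag 0, then 0 and 1.0 at odd and even lags.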

Results and Discussion 

Because the first 20 trials in the sessions were 
stored as samples for calculations and were 


unaffected by the contingencies of reinforce- 
ment, we used the data from the last 420 trials 
per session to: (1) assess run structure; and (2) 
examine sequential dependencies. 

Runs Analyses 

At every lever press, a runs test score, S, was 
produced. Figure 2 plots proportions of the S 
scores whose absolute values were smaller than 
1, between 1 and 2, and larger than 2, in each 
session. In the sessions before the vertical 
dashed line (Area A), additional food deliveries 
occurred when the stimulus lights turned 
off. Sessions after this line (Area B) had no 
additional food deliveries. In the sessions after 
the vertical solid line (Area C), the critical 
value was changed from ±1.96 to ±1.39. 

At the beginning of Experiment 1, all subjects 
showed low proportions of S scores in the range 
−1 to 1. Subjects 1 and 9 showed increases after 
only a few sessions. Subject 3 initially showed a 
large proportion of S scores whose absolute 
values were greater than 2 (ineligible for 
reinforcement). These decreased, and the 
proportion between 1 and 2 increased, with 
further training. Subject 5 showed little 
differentiation of S scores. After removing additional 
food deliveries, subjects’ performances 
deteriorated temporarily. When the criterial region 
narrowed to ±1.39, the performance of all 
subjects improved in that the proportion of S 
values in the range −1 to 1 increased, and more 
extreme S values decreased, although these 
changes were small for Subject 9. 

If the rats responded perfectly according to 
the reinforcement contingency, all responses 
in a session would produce S scores in the 
prescribed range and illuminate the cue lights. 
Figure 3 plots the proportion of responses that 
illuminated the cue lights, and hence were 
eligible for primary reinforcement. Except for 
Subject 9, whose performance was consistently 
close to 1.0 after the first few sessions, 
performances became more and more eligible 
for reinforcement with extended exposure to 
the contingency. Therefore, the results indi- 
cate that differential reinforcement by the 
runs-test criterion can modify the subjects’ 
performances. 

Fig. 2. Proportions of absolute values of S in three ranges: smaller than 1, between 1 and 2, and larger than 2, in each 
session of Experiment 1. In Area A, the criterial range for S scores was ±1.96, and there were additional food 
deliveries when the stimulus lights were off. In Area B, there was no additional food. In Area C, the criterial range was 
reduced to ±1.39. 

Analyses of Sequential Dependencies 

Runs data alone cannot provide complete 
evidence for sequential dependencies. Accordingly, 
we did not employ the runs test as a 
statistical test and instead, we relied on mutual 
uncertainties. This approach permitted us to 
examine sequential dependencies in much 
greater detail. We examined the way subjects 
adapted to the contingency, that is, whether they 
developed higher-order dependencies as first-order 
dependency decreased, or whether sequential 
dependencies were removed altogether. 

Fig. 3. Proportions of responses that illuminated the cue lights in each session of Experiment 1. Areas A, B, and C are 
the same as for Figure 2. 

Figure 4 plots mutual uncertainties, $T_m$, for 
m = 1, 2, and 3 (Equation 9). Each column in 
Figure 4 shows the chi-square values associated, 
respectively, with $T_1$ (first order), $T_2$ (second 
order), and $T_3$ (third order) sequential 
dependencies across the 65, 114, 111, and 
57 sessions, respectively, for each subject. Note 
that the degree of sequential dependency cannot 
be an all-or-none phenomenon; it is necessarily 
a continuum. This is true even after chi-square 
transformation. Horizontal lines indicate 
5% critical chi-square values. Observed 
chi-square values below the horizontal lines 
indicate performance that exhibits no significant 
sequential dependency. Sessions prior to the point 
indicated by a vertical line had additional food 
deliveries with stimulus lights off. These 
indices are useful for investigating the trends 
of the sequential-dependency data. Comparing 
panels horizontally within subjects, the 
lowest order tends to show the highest level of 
dependency. Although large values of $T_1$ were 
generated in the first sessions, for all subjects 
$T_1$ decreased below the critical value as the 
training progressed. Subjects 1, 3, and 5 
approximated independence at all $T_m$, although 
after initially achieving sequential 
independence, Subject 1 developed a slight 
first-order dependency towards the end of the 
experiment. Subject 9 continued to show 
higher-order dependencies throughout. 

A lag analysis was conducted to examine the 
obtained response patterns. Figure 5 shows 
results from lag zero to lag 6 in the first seven 
sessions of the ±1.96 condition, and in the last 
seven sessions of the ±1.39 condition. Only the lag 
profiles of right responses are shown. The 
profiles of left responses had a similar tenden- 
cy. Horizontal solid lines indicate uncondi- 
tioned probability, that is, lag zero values, in 
each session. If there were no sequential 
dependencies, all lag values would be similar 
to lag zero values. 
Fig. 4. Mutual uncertainties in Experiment 1. The difference ($T_m$) between successive uncertainty indices ($H_m$ and 
$H_{m+1}$) for each subject for each order of sequential dependency. Horizontal lines indicate critical values for chi square. 
Data points below the critical value represent no significant difference between $H_m$ and $H_{m+1}$. See text for calculations of 
Hs, Ts and transformations to chi square. 


In the first seven sessions, Subjects 1, 3, and 5 
showed stable and consistent tendencies of 
repetition, like RR or RRR, but Subjects 3 
and 5 did not show the same tendencies in the 
last seven sessions. This means that performance 
of these subjects approximated sequential 
independence. The lag profile of Subject 1 in the last 
seven sessions showed a simple alternation 
pattern, RL. Subject 9 showed the pattern RLR 
in the first two sessions, which changed over the 
course of three to seven sessions (RLL, RLLR). 
Its lag profiles seemed to be similar in pattern in 
the last seven sessions of the ±1.39 condition; 
however, note that the lag-1 probability 
approximated that of lag zero. In other words, the 
first-order dependency disappeared. 

Fig. 5. Conditional probability profiles of right responses for the first seven sessions with the criterion set at ±1.96 
and the last seven sessions with it set at ±1.39, in Experiment 1. Each set of seven connected points, Lags 0 to 6, 
corresponds to one session. The horizontal dotted line represents p of .5. The first point of each profile is the relative 
frequency of right responses (R/(R+L)). The next four points are the conditional probabilities at each lag. 

Because the lag zero probability coincided 
with that of its elementary components (L or 
R), lag zero also indicates response biases in 
emitting L and R alternatives. In the first seven 
sessions, most subjects revealed no striking 
biases. However, in the last seven sessions, 
some subjects showed a distinct bias for the left 
lever (see Subjects 1 and 3). 

Fig. 6. The relative distributions of four-response units in the first session of Experiment 1, the last session with the 
criterion set at ±1.96, and the last session with it set at ±1.39, arranged in successive columns. Lines are expected 
values from randomness. 

Finally, Figure 6 plots the relative frequencies 
of four-response sequences as units. Solid 
lines show the expected relative frequencies of 
the quadruplets, calculated from the relative 
frequencies of the individual responses 
(Jensen, Miller, & Neuringer, 2006). 
For example, when p(R) = .25 and p(L) = .75, 
p(LLLL) = .75 × .75 × .75 × .75 = .316 and 
p(LRLR) = .75 × .25 × .75 × .25 = .035. These 
are the values expected from a stochastic process. The first 
column in Figure 6 shows that subjects’ performances 
deviated from the expected distribution 
during the first session of the experiment. 
However, the middle and right columns show 
that their performances changed, and for 
Subjects 1, 3, and 5, approximated the expected 
distribution. That is, what 3 of 4 subjects were 
effectively doing was approximately random. 
Subject 9’s performance was characterized by 
alternation patterns; that is, LLRL, LRLR, 
LRLL, RLLR, and RLRL were emitted frequently. 
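
The expected values for four-response units follow from multiplying the probabilities of the individual responses. A minimal sketch, with a hypothetical function name, assuming independent successive responses with a fixed p(R):

```python
from itertools import product

def expected_quadruplets(p_r):
    """Expected relative frequency of each four-response sequence under independence."""
    p = {"R": p_r, "L": 1 - p_r}
    expected = {}
    for combo in product("LR", repeat=4):
        prob = 1.0
        for response in combo:
            prob *= p[response]
        expected["".join(combo)] = prob
    return expected

expected = expected_quadruplets(0.25)
print(round(expected["LLLL"], 3), round(expected["LRLR"], 3))   # 0.316 0.035
```

The 16 expected values sum to 1, and comparing them with observed quadruplet frequencies gives the kind of contrast plotted in Figure 6.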


This experiment was designed to demonstrate 
a new technique for controlling behavioral 
variability, using a runs-test criterion. Generally, 
first-order dependency, that is, $T_1$ in the uncertainty 
indices, was controlled well in all subjects. In 
addition, results showed that Subjects 1, 3, and 5 
achieved sequentially independent behavior by 
successfully excluding dependencies of several orders ($T_1$, $T_2$, 
$T_3$); however, one (Subject 9) maintained 
higher-order dependency. 

As discussed earlier, the runs test gauges the 
number of runs observed in a performance 
relative to the expected number. Because the 
production of a run depends on whether 
subjects repeat or alternate a response emitted 
on the preceding trial, our runs-test algorithm 
affected the level of repetition and alternation, 
that is, first order dependency. The level of 
repetition and alternation relates directly to the 
first-order dependency, because both describe 
the relation between responding on one trial 
and that on the preceding trial. Therefore, our 
procedure was successful in eliminating a first- 
order sequential dependency, in spite of the 
fact that higher-order dependencies were evi- 
dent in Subject 9’s profile. 

Having achieved sequential independence 
under the runs-test contingency, Subject 1 later 
developed first-order dependency. This is trivial 
because the relative distribution of four-re- 
sponse units showed that its behavior closely 
approximated the expected distribution. We 
believe that it was the result of an extreme bias 
(.05:.95) toward one of the two responses. For 
example, one sequence consisted of 10 consecutive 
Ls, one R, and nine Ls (i.e., 
LLLLLLLLLLRLLLLLLLLL); this yields a runs score of 0.33, 
based on three runs. Such an outcome can occur 
if the less frequent response (e.g., here R) is not 
first or last in a series. By contrast, a sequence 
consisting of nine Ls, two Rs, and nine Ls (i.e., 
LLLLLLLLLRRLLLLLLLLL) yields a score of 
−2.28, which is outside the criterial range. In the 
case of extreme bias, the subject has to emit only 
one response to the less-preferred lever and 
return to more-preferred lever. The results of 
the lag analysis were consistent with this 
prediction. It was possible that subjects could 
learn to use the light-off as a cue for switching to 
the less-chosen lever. However, only 1 rat 
(Subject 1) developed this and only after much 
training, suggesting that such an unusual 
discrimination is generally difficult to acquire. 

EXPERIMENT 2 

In Experiment 2, we modified the proce- 
dure in several ways. First, to make the effects 


of the runs-test contingency clearer, we trained 
subjects in a standard concurrent schedule for 
several sessions before introducing the runs- 
test contingency. Second, we no longer illumi- 
nated the stimulus lights. If subjects had used 
them as a discriminative stimulus in Experi- 
ment 1, then this would permit them to emit 
different patterns of responses, respectively, in 
conditions with lights on versus off. Such a 
discrimination may have contaminated the 
effect of differential reinforcement. Third, we 
held the probability of reinforcement con- 
stant. Many studies indicate that behavioral 
variability is influenced by variation of 
reinforcement frequency (Boren, Moerschbaecher, 
& Whyte, 1978; Gharib, Derby, & Roberts, 
2001; Gharib, Gade, & Roberts, 2004; Grunow 
& Neuringer, 2002; Tatham, Wanchisen, & 
Hineline, 1993). In Experiment 1, it is possible 
that the change from less frequent to more 
frequent reinforcement, rather than the runs- 
test contingency, was responsible for the 
development of sequential independence. By 
keeping reinforcement probability constant in 
Experiment 2, we eliminated this factor as a 
source of sequential independence. 

Finally, in order to hold the probability of 
reinforcement constant, we also adjusted the 
runs-test criterion. Instead of using criterial 
test values, such as 1.96 and 1.39, we relied 
upon a percentile criterion (see Alleman & 
Platt, 1973; Galbicka, 1988, 1994; Machado, 
1989). After each response, the current S score
was compared against the scores in the last 19
trials. A food pellet was delivered with probability
2/3 if the current score was closer to zero
than at least 17 of the previous 19 scores. Because the
criterion is defined relative to the subject's own recent
scores, this procedure can hold the probability of
reinforcement constant.
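
The arithmetic behind this criterion can be sketched as follows. If the 20 scores (the current one plus the previous 19) were exchangeable and free of ties, the current score would beat at least 17 of the other 19 on 3 of every 20 trials, giving an overall reinforcement probability of about .15 × 2/3 = .1. The simulation below is only an illustration under those assumptions: it draws independent continuous scores, whereas real runs scores are discrete and serially dependent, so the obtained probability in the experiment may differ. Treating ties as failures is also our assumption, and the function names are hypothetical.

```python
import random
from collections import deque


def meets_percentile_criterion(current, previous, required=17):
    """True if |current| is strictly smaller than at least `required` previous |scores|."""
    beaten = sum(abs(current) < abs(p) for p in previous)
    return beaten >= required


def simulate_reinforcement_rate(n_trials=50_000, window=19, p_food=2 / 3, seed=0):
    """Estimate the overall food probability with independent, tie-free scores."""
    rng = random.Random(seed)
    history = deque(maxlen=window)  # the previous 19 scores
    pellets = 0
    for _ in range(n_trials):
        score = rng.gauss(0, 1)  # stand-in for a runs score (assumption)
        if len(history) == window and meets_percentile_criterion(score, history):
            if rng.random() < p_food:
                pellets += 1
        history.append(score)
    return pellets / n_trials


print(simulate_reinforcement_rate())  # close to 0.1 under these assumptions
```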

Method 

Subjects 

Four male Wistar rats (Subjects A, B, C, D) 
were maintained at approximately 80% of their 
free-feeding body weights. They were experimentally
naive and 40 weeks old at the start of the
experiment. Water and sawdust were continu- 
ously available in their home cages where a 12- 
hr light-dark cycle was in effect. 

Apparatus 

The apparatus was the same as in Experi- 
ment 1 except all experimental devices were 
controlled by a computer using Visual Basic
2005 Express Edition software. 

Procedure 

After subjects were trained to press the lever 
by hand shaping, they were exposed to a 
continuous-reinforcement schedule, which 
provided 100 food deliveries per session. 
Either the left or right lever provided rein- 
forcement in a given session, and the reinforc- 
ing lever was switched after each training 
session. After a few sessions, when all subjects 
pressed both levers reliably, two-lever training 
was initiated. In this procedure, a reinforcer 
was assigned probabilistically to a particular 
lever. No further assignments were made until 
the reinforcer was delivered (Stubbs & Pliskoff, 
1969). In the baseline, reinforcers were allo- 
cated equally often for left and right respons- 
es. Each session ended after 500 responses. 
The probability of reinforcement was de- 
creased gradually from 1.0 to .1. Once the 
reinforcement probability had been reduced 
to .1, it remained at that level until perfor- 
mances stabilized. It is against this baseline 
that we compare the results from the runs-test 
phase, which was run next. Both the baseline 
phase and the runs-test phase had the same 
probability of reinforcement, but the baseline 
phase had no runs-test contingency. 
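
One way to read the baseline assignment rule is sketched below. We assume that a reinforcer is set up with probability p at each response when none is pending, that it is assigned to the left or right lever with equal probability, and that it is held until the subject presses the assigned lever. These details, and the class name `SingleAssignmentScheduler`, are our assumptions for illustration; the original control program is not described at this level of detail.

```python
import random


class SingleAssignmentScheduler:
    """Holds at most one pending reinforcer, assigned to one lever at a time."""

    def __init__(self, p_setup, p_left=0.5, rng=None):
        self.p_setup = p_setup      # probability of setting up a reinforcer
        self.p_left = p_left        # probability that it is assigned to the left lever
        self.rng = rng or random.Random()
        self.pending = None         # 'L', 'R', or None

    def respond(self, lever):
        """Register a press on 'L' or 'R'; return True if food is delivered."""
        # No new assignment is made while an earlier one is still waiting.
        if self.pending is None and self.rng.random() < self.p_setup:
            self.pending = "L" if self.rng.random() < self.p_left else "R"
        if self.pending == lever:
            self.pending = None
            return True
        return False


scheduler = SingleAssignmentScheduler(p_setup=0.1, rng=random.Random(1))
session = [scheduler.respond(random.choice("LR")) for _ in range(500)]
print(sum(session), "reinforcers in 500 responses")
```

Because a pending reinforcer blocks further assignments, exclusive responding on one lever eventually stops producing food, which is the property that keeps reinforcers allocated equally often to the two levers.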

In the runs-test contingency phase, the score 
on each trial was compared against the 
previous 19. If the current score was closer to
zero than at least 17 of the previous 19 scores, then
a reinforcer was delivered with p = .667. Once
the runs-test score reached criterion, several
successive trials could, in some cases, each meet the
criterion and so be eligible for reinforcement. Except for the absence of stimulus
lights, the remaining procedures and analyses 
were the same as in Experiment 1. 

Results and Discussion 

Again we examine the runs structure of 
subjects’ behavior first, and then the data on 
sequential dependencies among successive 
responses. 

Runs Analyses 

Figure 7 plots proportions of S scores whose 
absolute values were smaller than 1, between 1 
and 2, and larger than 2, in each session. The 
sessions before the vertical line are from the 
baseline phase, where the probability of
reinforcement was .1, whereas those after the
vertical line are from the runs-test phase, in which
reinforcement was differential but had the same
probability. In the baseline phase, Subject A
showed similar proportions of S scores smaller
than 1 and between 1 and 2. Only Subject D 
showed an increase in the proportion that 
were smaller than 1. On transition to the test 
phase, all subjects improved their proportions 
in this range. Scores for Subjects B and C 
improved rapidly, while Subject A improved 
gradually. Comparing the last five sessions 
between baseline and the runs test phases, all 
subjects improved their scores. Thus, Figure 7 
reveals that in Experiment 2, as in Experiment 
1, behavior of all subjects was sensitive to the 
runs test contingency. 

Sequential Dependency Analyses 

Mutual uncertainties are plotted in Figure 8 
for the last five sessions. Results from both 
baseline and the runs-test phases are shown, 
separated by a vertical line. Successive columns 
give chi-square values of T1, T2, and T3.
Horizontal lines indicate 5% critical values of
the chi square; values below the horizontal
lines indicate that performance showed no
sequential dependency. The first column (T1)
shows that except for Subject D, first-order
sequential dependency was present in baseline,
but this decreased under the runs-test
contingency. Columns for T2 and T3 show that
sequential independence was achieved in the 
higher orders for Subjects A and D, whereas 
some dependencies remained in Subjects B 
and C. These results are in broad agreement 
with those of Experiment 1. 
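
As a rough illustration of the first-order measure, the sketch below computes the mutual uncertainty between successive responses and converts it to a chi-square value. The calculations of H and T used in this study are given in an earlier section; here we assume the standard information-theoretic definition, T1 = H(previous) + H(current) - H(joint) in bits, and the likelihood-ratio approximation chi-square = 2 ln(2) N T1 with 1 degree of freedom for two alternatives. The function names are ours.

```python
import math
from collections import Counter

CRITICAL_5PCT_DF1 = 3.841  # 5% critical value of chi square with 1 df


def entropy(counts):
    """Shannon uncertainty (bits) of a Counter of frequencies."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)


def first_order_dependency(choices):
    """Return (T1 in bits, chi-square approximation) for successive responses."""
    pairs = list(zip(choices, choices[1:]))
    n = len(pairs)
    h_prev = entropy(Counter(a for a, _ in pairs))
    h_next = entropy(Counter(b for _, b in pairs))
    h_joint = entropy(Counter(pairs))
    t1 = h_prev + h_next - h_joint
    return t1, 2 * math.log(2) * n * t1


t1, chi = first_order_dependency("LR" * 50 + "L")   # strict alternation
print(round(t1, 3), chi > CRITICAL_5PCT_DF1)        # 1.0 True
```

Strict alternation is fully predictable from the previous response, so T1 reaches its maximum of 1 bit; a sequence in which all four dyads are equally frequent gives T1 = 0.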

Figure 9 presents a lag analysis for the last 
five sessions of both phases. Lag profiles 
showed all subjects favored some response 
sequence patterns in the baseline phase. 
Typical patterns were RR (Subjects B and C), 
or RRL (Subjects A and D). However, in the 
runs-test phase, such patterns gradually disap- 
peared. For all subjects lag-1 probability was 
similar to lag 0, that is, the first order 
dependency disappeared. Moreover, Subjects 
A and D showed almost no pattern. Subject B 
retained the same pattern as in baseline, 
although it became less conspicuous, and 
Subject C tended to emit L in Lag 2. In 
comparing these data with the lag data of Experiment 1,
we see that these subjects exhibited no
bias toward either lever; instead, response
probabilities were near .5.

Fig. 7. Proportions of absolute values of S in three ranges: smaller than 1, between 1 and 2, and larger than 2 in each
session of Experiment 2. The area before the vertical line is baseline and that after it is the runs-test phase.
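
A lag profile of the kind described above can be sketched as follows. We assume that the lag-k point is the conditional probability of a right response given that the response k trials earlier was also right, and that the lag-0 point is the overall relative frequency of right responses; the exact definition is given with Experiment 1, so this should be read as an illustrative reconstruction. The name `lag_profile` is ours.

```python
def lag_profile(choices, max_lag=6):
    """Return [P(R), P(R at t | R at t-1), ..., P(R at t | R at t-max_lag)]."""
    profile = [choices.count("R") / len(choices)]  # lag 0: R / (R + L)
    for lag in range(1, max_lag + 1):
        # Responses that occurred `lag` trials after a right response.
        followers = [choices[t] for t in range(lag, len(choices))
                     if choices[t - lag] == "R"]
        if followers:
            profile.append(followers.count("R") / len(followers))
        else:
            profile.append(float("nan"))  # no right responses to condition on
    return profile


print(lag_profile("RL" * 50, max_lag=2))  # [0.5, 0.0, 1.0]
```

For a sequentially independent sequence, every point of the profile lies near the lag-0 value, which is the flat profile reported for Subjects A and D.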

Finally, Figure 10 plots the relative frequen- 
cies of four-response sequences as units. At the 
start of the experiment (left column), all 
subjects tended to repeat responses, that is, 
LLLL and RRRR are high. Through baseline 
sessions, their performance was modified 
somewhat. By the end of baseline training 
(middle column), for all 4 rats, a common 
pattern is evident in that the frequency
of double-alternation patterns (LLRR and
RRLL) increased, and high-alternation patterns
(LLRL, LRLL, LRLR, RLRL, RLRR)
remained low. This pattern was lost by the end
of the runs-test phase (right column), and
profiles approximated the expected values 
derived by assuming randomness. 
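
The comparison between observed and expected four-response units can be sketched as below. Two points are assumptions on our part: units are counted in overlapping windows that advance one response at a time, and the expected values assume independent responses with a fixed probability of a right response (.5 by default). The function names are hypothetical.

```python
from collections import Counter
from itertools import product

UNITS = ["".join(u) for u in product("LR", repeat=4)]  # LLLL ... RRRR


def four_response_distribution(choices):
    """Relative frequency of each four-response unit in a string of 'L'/'R'."""
    windows = [choices[i:i + 4] for i in range(len(choices) - 3)]
    counts = Counter(windows)
    return {unit: counts[unit] / len(windows) for unit in UNITS}


def expected_distribution(p_right=0.5):
    """Expected relative frequencies if responses are independent."""
    return {unit: p_right ** unit.count("R") * (1 - p_right) ** unit.count("L")
            for unit in UNITS}


observed = four_response_distribution("LLRR" * 10)
print(observed["LLRR"], observed["LRLR"])   # double alternation dominates
print(expected_distribution()["LLRR"])      # 0.0625
```

A double-alternation sequence such as the one above concentrates all of its frequency in four of the sixteen units, whereas a random sequence with no bias spreads it evenly at 1/16 each.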

The results in Experiment 2 replicated those 
of Experiment 1. All subjects were susceptible
to a reinforcement contingency that used the
runs-test algorithm (Figure 7). In addition to
showing sensitivity to this contingency,
subjects largely eliminated sequential
dependencies from their performance (Figure 8). This tendency
did not differ between Experiments
1 and 2 in spite of the fact that conditioned 
reinforcers were removed and primary rein- 
forcement was more strictly controlled in the 
latter. 

Fig. 8. Mutual uncertainties of the last five sessions under each phase of Experiment 2. Horizontal lines indicate critical
values for chi square. Data points below the critical value represent no significant difference between H_m and H_{m+1}. See
text for calculations of Hs, Ts, and transformations to chi square.


Our differential reinforcement procedures 
were designed to have no effect on response 
bias. Subjects in Experiment 1 showed a strong 
bias to the left lever (Figure 5) whereas in 
Experiment 2 they showed almost no bias. In 
consequence, they attained uniform distribu- 
tion of choice between response alternatives 
(Figures 9 and 10). Thus, our results showed 
we could control variability, producing a
sequentially independent pattern, regardless
of whatever bias existed; it was not a byproduct 
of differentially reinforcing equiprobable out- 
comes. 

GENERAL DISCUSSION 

The present work aimed to demonstrate a
new reinforcement contingency based on run-structure
analyses of successive responses in a
choice task. By using the runs-test algorithm as
a criterion for differential reinforcement, we
show that first-order response dependencies
can be successfully removed. Higher-order
dependencies were sometimes present early
in training also, and these were often reduced
with extended exposure to the contingency.
Thus, the new contingency appeared to be
effective in modifying the structure of response
runs in almost all subjects.

Fig. 9. Conditional probability profiles of right responses for the last five sessions in the baseline and runs-test phases
of Experiment 2. Each set of seven connected points, Lags 0 to 6, corresponds to one session. The horizontal dotted line
represents p of .5. The first point of each profile is the relative frequency of right responses (R/(R+L)). The next four
points are the conditional probabilities at each lag.


Fig. 10. The relative distributions of four-response units. The left column is for the first session of Experiment 2, the
middle column is for the last session of baseline, and the right column is for the last session of the runs-test phase. Lines
are expected values from randomness.

A possible criticism involves our use of the
runs test. This test was designed as a test for
randomness. Equation 3 is appropriate for
cases where at least one of the response
alternatives occurred more than 20 times, that
is, for large numbers (Siegel, 1956), whereas in
our experiment, the sum of both response
alternatives is 20. However, we used the runs
test not as a statistical test for randomness, but
rather as a criterion for differential reinforcement.
Thus, the issue becomes whether or not
our conclusions about the effects of contingency
are reliable in this context. To assess
this, we relied upon a nonparametric method,
for which Siegel (1956) and Swed and Eisenhart
(1943) prepared tables of expected runs
based on small samples. These tables provided
appropriate critical values in the case of small
samples. Thus, if we compare data in Figure 1,
calculated from Equation 3, with test-score
statistics for this nonparametric test, the latter
decreases the risk of Type 1 error (i.e.,
rejecting a true null hypothesis of no dependency),
whereas it increases the risk of Type 2
error. In other words, our use penalizes Type 2 
errors more than predicted by the nonpara- 
metric test tables. In effect, this means we may 
have imposed a more severe criterion than 
required by the runs test. This possibility does 
not present a problem for our conclusions. 
Rather we note that the procedure for 
differential reinforcement requires a sample 
size that is not so large as to dilute the 
differential nature of the contingency (Alle- 
man & Platt, 1973; Galbicka, 1988, 1994). 
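
For readers who wish to compare the normal approximation with small-sample values, the exact distribution of the number of runs can be computed directly from combinatorial counts. The sketch below uses the standard formula for the number of arrangements of n1 and n2 responses that yield r runs; it is offered as an illustration only, was not part of the experimental procedure, and the function name is ours.

```python
from math import comb


def exact_runs_probabilities(n1, n2):
    """Exact P(number of runs = r) for n1 responses of one kind and n2 of the other."""
    if n1 < 1 or n2 < 1:
        raise ValueError("both alternatives must occur at least once")
    total = comb(n1 + n2, n1)  # number of equally likely arrangements
    probabilities = {}
    for r in range(2, n1 + n2 + 1):
        k = r // 2
        if r % 2 == 0:
            ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
        else:
            ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
        if ways:
            probabilities[r] = ways / total
    return probabilities


print(exact_runs_probabilities(19, 1))  # {2: 0.1, 3: 0.9}
```

With 19 responses on one lever and 1 on the other, three runs occur in 18 of the 20 possible arrangements, namely whenever the single less frequent response is neither first nor last in the window.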

Our procedure involved an interlocking 
schedule with which two experimental dimen- 
sions (K, response proportion) are related. In 
previous investigations, either the proportion 
of responses to an alternative, or the number 
of runs, has been used as the basis for 
differential reinforcement (Bryant & Church, 
1974; Machado, 1997; Neuringer, 1986). By 
contrast, we attempted to combine these 
dimensions and to contrive a procedure of 
differential reinforcement for sequential dependencies.
This procedure differs from differential
reinforcement of response alternatives with
lower frequency in that it permits one
response alternative to have high frequency.
However, performance approached an equi- 
probable state and some subjects performed 
randomly in Experiment 2. Such findings 
suggest there may be various procedures that 
will yield highly variable or random behavior. 
If so, it remains to be determined what the 
necessary and sufficient conditions are for 
producing this behavior. 

We note two different views on reinforced 
sequential dependencies, according to differ- 
ent epistemological attitudes, that is, molar 
and molecular. From the molar standpoint, 
molar behavioral phenomena, say, allocations 
of behavior, response rates, and behavioral 
variability, are regarded as individuals or 
concrete particulars, as species were (Baum, 
2002; Glenn & Field, 1994). From the molec- 
ular standpoint, such phenomena are regard- 
ed as abstractions or derived things. Glenn 
(2003) discussed them from the analogy of 
organic evolutionary theory, in which Maynard 
Smith (1994) characterized the increases in 
complexity during evolution of the organic 
world as resulting from a succession of 
processes that became possible only when a
previous level of complexity had been reached.
With behavior, complex behavioral phenome- 
na are regarded as a result of repeated rounds 
of selection acting on phenomena resulting 
from earlier rounds of selection. If we regard 
the phenomena as derived things, we would 
seek the cause of variation of the behavioral 
variability in earlier rounds of selection. On 
the other hand, if we regard them as concrete 
particulars, we would focus on the effect of the 
behavioral phenomena at the higher-complex- 
ity level. With behavioral variability, Machado 
(1992, 1997) claimed that dispersion of 
response alternatives might have been a 
derivative of more fundamental processes. 
This claim is reasonable because the process 
of differential reinforcement of response 
alternatives with lower frequency produced 
the behavioral variability. On the other hand, 
some researchers focused on the effect of 
variation and repetition as a concrete particu- 
lar in choice, delayed reinforcement, resis- 
tance to change, and so on (Abreu-Rodrigues 
et al., 2005; Doughty & Lattal, 2001; Neuringer,
1992; Odum et al., 2006; Wagner &
Neuringer, 2006). These studies have also yielded
useful findings. Whereas our experiments
showed the effects of the runs-test contingency
on sequential dependencies, studies that reveal
the effect of sequential patterns on
complex behavioral phenomena remain for
the future.